Reciprocal Learning

Neural Information Processing Systems

These instances range from active learning over multi-armed bandits to self-training. We show that all these algorithms not only learn parameters from data but also do the reverse: they iteratively alter the training data in a way that depends on the current model fit. We introduce reciprocal learning as a generalization of these algorithms, using the language of decision theory. This allows us to study under what conditions they converge.
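The feedback loop the abstract describes can be illustrated with self-training, one of the instances it names. Below is a minimal toy sketch (not the paper's formalism): a 1-D threshold classifier pseudo-labels unlabeled points it is confident about, so the training data itself changes as a function of the current fit, and the loop stops at a fixed point where the data stops changing. All names and the margin rule are hypothetical illustration choices.

```python
# Toy self-training loop: the training data depends on the current model fit.
# Classifier: a single decision threshold on 1-D inputs (hypothetical example).

def fit_threshold(points):
    """Fit a decision threshold midway between the two class means."""
    xs0 = [x for x, y in points if y == 0]
    xs1 = [x for x, y in points if y == 1]
    return (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2

def self_train(labeled, unlabeled, margin=1.0, rounds=10):
    data = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        t = fit_threshold(data)
        # Pseudo-label only points far from the current decision boundary ...
        confident = [x for x in pool if abs(x - t) > margin]
        if not confident:
            break  # fixed point: the data no longer changes
        data += [(x, int(x > t)) for x in confident]
        pool = [x for x in pool if abs(x - t) <= margin]
    return fit_threshold(data), data

labeled = [(0.0, 0), (1.0, 0), (9.0, 1), (10.0, 1)]
unlabeled = [2.0, 3.0, 7.0, 8.0, 5.2]
threshold, final_data = self_train(labeled, unlabeled)
```

Note how the ambiguous point 5.2 is never absorbed into the training set: the loop converges once no remaining point clears the confidence margin, which is the kind of convergence condition the paper studies in general.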


Human-aligned Quantification of Numerical Data

Kolonin, Anton

arXiv.org Artificial Intelligence

Quantifying numerical data involves addressing two key challenges: first, determining whether the data can be naturally quantified, and second, identifying the numerical intervals or ranges of values that correspond to specific value classes, referred to as "quantums," which represent statistically meaningful states. If such quantification is feasible, continuous streams of numerical data can be transformed into sequences of "symbols" that reflect the states of the system described by the measured parameter. People often perform this task intuitively, relying on common sense or practical experience, while information theory and computer science offer computable metrics for this purpose. In this study, we assess the applicability of metrics based on information compression and the Silhouette coefficient for quantifying numerical data. We also investigate the extent to which these metrics correlate with one another and with what is commonly referred to as "human intuition." Our findings suggest that the ability to classify numeric data values into distinct categories is associated with a Silhouette coefficient above 0.65 and a Dip Test below 0.5; otherwise, the data can be treated as following a unimodal normal distribution. Furthermore, when quantification is possible, the Silhouette coefficient appears to align more closely with human intuition than the "normalized centroid distance" method derived from the information-compression perspective.
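To make the 0.65 Silhouette threshold concrete, here is a pure-Python sketch of the Silhouette coefficient on 1-D data (real pipelines would typically use something like `sklearn.metrics.silhouette_score`; the data values below are invented for illustration). Well-separated value ranges score well above the heuristic threshold, while overlapping ranges do not.

```python
# Illustrative Silhouette coefficient for 1-D data split into candidate
# "quantums" (clusters). Sketch only, not the paper's implementation.

def mean_dist(x, xs):
    return sum(abs(x - o) for o in xs) / len(xs)

def silhouette(clusters):
    """Mean silhouette over all points; `clusters` is a list of value lists."""
    scores = []
    for i, ci in enumerate(clusters):
        for k, x in enumerate(ci):
            rest = [o for m, o in enumerate(ci) if m != k]
            a = mean_dist(x, rest) if rest else 0.0  # cohesion within cluster
            b = min(mean_dist(x, cj)                 # separation from others
                    for j, cj in enumerate(clusters) if j != i)
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Clearly separated values quantify naturally (score above the 0.65 heuristic) ...
well_separated = silhouette([[1.0, 1.1, 0.9], [9.0, 9.1, 8.9]])
# ... while overlapping values suggest a single unimodal state instead.
overlapping = silhouette([[1.0, 2.0, 3.0], [2.5, 3.5, 4.5]])
```

Under the paper's heuristic, only the first split would be accepted as a meaningful quantification; the Dip Test check (below 0.5) would be a separate multimodality test not shown here.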


Adversarial Risk and Robustness: General Definitions and Implications for the Uniform Distribution

Dimitrios Diochnos, Saeed Mahloujifar, Mohammad Mahmoody

Neural Information Processing Systems

As the current literature contains multiple definitions of adversarial risk and robustness, we start by giving a taxonomy for these definitions based on their direct goals; we identify one of them as the one guaranteeing misclassification by pushing the instances to the error region. We then study some classic algorithms for learning monotone conjunctions and compare their adversarial robustness under different definitions by attacking the hypotheses using instances drawn from the uniform distribution. We observe that sometimes these definitions lead to significantly different bounds. Thus, this study advocates for the use of the error-region definition, even though other definitions, in other contexts with context-dependent assumptions, may coincide with the error-region definition.
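The error-region definition can be sketched concretely for the abstract's own setting: monotone conjunctions under the uniform distribution, with the adversary's budget measured in Hamming distance. The exhaustive enumeration below is illustrative only (tiny n, invented target/hypothesis pair), not the paper's algorithms: a point counts toward the adversarial risk if some point within the budget lies in the error region, i.e. where hypothesis and target disagree.

```python
# Error-region adversarial risk for monotone conjunctions over the uniform
# distribution on {0,1}^n (exhaustive sketch for tiny n; illustrative only).
from itertools import product

def conj(indices):
    """Monotone conjunction: true iff all listed bits are 1."""
    return lambda x: all(x[i] for i in indices)

def hamming_ball(x, budget):
    """All points within Hamming distance `budget` of x."""
    pts = {x}
    for _ in range(budget):
        pts |= {p[:i] + (1 - p[i],) + p[i + 1:]
                for p in pts for i in range(len(p))}
    return pts

def error_region_risk(h, c, n, budget):
    """Fraction of uniform x pushable into the error region (h != c)."""
    hits = sum(
        any(h(p) != c(p) for p in hamming_ball(x, budget))
        for x in product((0, 1), repeat=n)
    )
    return hits / 2 ** n

target = conj([0, 1])         # c(x) = x0 AND x1
hypothesis = conj([0, 1, 2])  # h mistakenly also requires x2

risk_plain = error_region_risk(hypothesis, target, 3, 0)  # ordinary risk
risk_adv = error_region_risk(hypothesis, target, 3, 1)    # one-bit-flip budget
```

With budget 0 this reduces to the ordinary risk (here 1/8, the single disagreement point 110); allowing one bit flip inflates it to 1/2, illustrating how the choice of definition and budget can change the bounds substantially.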


A survey and benchmark of high-dimensional Bayesian optimization of discrete sequences

Miguel González-Duque

Neural Information Processing Systems

Optimizing discrete black box functions is key in several domains, e.g. protein engineering and drug design. Due to the lack of gradient information and the need for sample efficiency, Bayesian optimization is an ideal candidate for these tasks. Several methods for high-dimensional continuous and categorical Bayesian optimization have been proposed recently. However, our survey of the field reveals highly heterogeneous experimental set-ups across methods and technical barriers for the replicability and application of published algorithms to real-world tasks. To address these issues, we develop a unified framework to test a vast array of high-dimensional Bayesian optimization methods and a collection of standardized black box functions representing real-world application domains in chemistry and biology.
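The propose/evaluate cycle that such methods share can be sketched in a few lines. The loop below is a deliberately crude stand-in, not any published algorithm: it replaces the Gaussian-process surrogate with a nearest-neighbor predictor plus a distance-based optimism bonus (UCB-like), over binary sequences. The objective and all parameter names are invented for illustration.

```python
# Hypothetical minimal loop in the spirit of discrete Bayesian optimization:
# nearest-neighbor surrogate + uncertainty bonus over binary sequences.
# Real methods use probabilistic surrogates; this only shows the cycle.
from itertools import product

def black_box(seq):
    """The unknown objective (a toy function we pretend not to know)."""
    return sum(seq) - 2 * abs(seq[0] - seq[-1])

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def acquisition(cand, observed, beta=1.0):
    """Value of the nearest observed point, plus an exploration bonus."""
    d, y = min((hamming(cand, s), v) for s, v in observed.items())
    return y + beta * d  # farther from the data => more optimistic

def optimize(n=5, budget=8):
    candidates = list(product((0, 1), repeat=n))
    observed = {candidates[0]: black_box(candidates[0])}
    for _ in range(budget):
        nxt = max((c for c in candidates if c not in observed),
                  key=lambda c: acquisition(c, observed))
        observed[nxt] = black_box(nxt)  # one expensive evaluation per round
    return max(observed, key=observed.get)

best = optimize()
```

The structure mirrors what a standardized benchmark has to fix across methods: a shared candidate space, a fixed evaluation budget, and a surrogate-driven acquisition step; only the surrogate and acquisition function differ between published algorithms.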

